In Search of Indicators of Detective Aptitude: Police Recruits’ Logical Reasoning and Ability to Generate Investigative Hypotheses

Previous psychological research on criminal investigation has not systematically addressed the role of deductive and inductive reasoning skills in decision-making in detectives. This study examined the relationship between these skills derived from a cognitive ability test used for police recruitment and test scores from an investigative reasoning skills task (Fahsing and Ask 2016). Newly recruited students at the Norwegian Police University College (N = 166) were presented with two semi-fictitious missing-person cases and were asked to report all relevant hypotheses and necessary investigative actions in each case. The quality of participants’ responses was gauged by comparison with a gold standard established by a panel of senior police experts. The scores from the deductive and inductive reasoning test were not related to participants’ performance on the investigative reasoning task. However, the presence or absence of an investigative “tipping-point” (i.e. arrest decision) in the two cases was systematically associated with participants’ ability to generate investigative hypotheses. Methodological limitations and implications for police recruitment and criminal investigative practice are discussed.

Introduction

The investigation of complex crime has been identified as a risky decision scene with potentially fatal consequences for victims, suspects and ultimately the legitimacy of the rule of law (Rossmo 2009). Still, the most recent evidence regarding the criminal investigations process indicates that detective work has remained essentially unchanged since RAND’s landmark study of detectives conducted during the 1970s (Greenwood et al. 1977; Horvath et al. 2001; Kingshott et al. 2015; Tong 2009). Detectives seem to struggle with the same hurdles as they did nearly 40 years ago; there is a need for better recruitment and improved training, caseloads are too large, and detectives are pressed for time to spend on solvable cases. Based on these findings, it appears that efforts to improve detective performance thus far have had limited success. The investigation of complex crimes depends strongly on the detective’s decision-making abilities (Irvine and Dunningham 1993; Westera et al. 2014b). As stated by Stelfox (2008), “The service’s capacity to carry out investigations comprises almost entirely the expertise of investigators” (p. 303). Despite this, the mental capacities required to carry out successful investigations have not been subjected to much attention in psychology or any other domain (Fahsing and Gottschalk 2008; O’Neill and Milne 2014; Smith and Flanagan 2000; Westera et al. 2014a). In order to reduce the risk of tunnel vision and to effectively operationalise the legal burden of proof, detectives must be able to apply abductive logic in all phases of criminal investigations (Fahsing and Ask 2016). Abduction, also called the logic of discoveries, is the cognitive process of identifying the best possible explanations for a set of observations (Josephson and Josephson 1994; Peng and Reggia 1990). Examples of tasks characterised by abductive logic include medical diagnosis (Feltovich et al. 1984), scientific discovery (Thagard 1989) and discourse comprehension (Kintsch 1988). Hence, the ability to think abductively in critical situations and to be able to put into practice such an approach in both the identification of case theories and the search for evidence is a fundamental reasoning skill for any detective (Carson 2011; Hald 2011; Innes 2003; Rønn 2013). The aim of the present study is to experimentally test the degree to which a general cognitive ability test used in personnel selection by the Norwegian Police University College can predict such investigative decision-making skills. More specifically, to what degree do deductive and inductive thinking abilities predict the ability to think abductively in criminal investigations?

Guilt Assumptions and Systemic Tunnel Vision

Although fictional myths often portray detectives as born rational masterminds, the available research on actual detectives’ decision-making gives a somewhat different picture (see e.g. Ask and Alison 2010; Fahsing and Ask 2016; Greenwood et al. 1977; Hallenberg et al. 2016). Instead of establishing in an unbiased fashion what might have happened, detectives are often primed at a very early stage of an investigation to focus in on the most likely suspect, and from then on investigate against him (Hill et al. 2008; Kassin et al. 2003). Their ability to later adjust this focus is generally found to be low, even when the available evidence changes quite dramatically (Ask and Granhag 2005; Ask et al. 2008). The use of such early “case theories” based on the initial suspicion is probably inevitable and most of the time probably quite helpful. Reasonable grounds for suspicion are after all a legal requirement to start up most criminal investigations, and most crimes are solved more or less on the spot by persons already known to the police (Stelfox 2009). On the other hand, fundamental principles of human rights direct any detective to presume all suspects innocent until proven guilty in court (Stumer 2010). The problem is therefore not the use of previous knowledge or the suspicion as such, but rather the restricted scope of alternatives that are considered. This is not at all unique to the field of criminal investigation; all complex investigative tasks are at risk of suffering from cognitive shortcomings and biases (Gilovich et al. 2002; Guyer and Wood 1998; Koehler 1991).

Such fixed mind-sets, or a biased capacity to identify all plausible alternatives before actively evaluating and integrating information to arrive at a choice, are a classic problem in fact-finding both within and outside of criminal procedure. Tversky and Kahneman (1974) noted that cognitive bias effects could extend to the legal system insofar as “beliefs concerning the likelihood of... the guilt of a defendant” could impact judicial decision-making (p. 1124). This problem is reinforced by the professional culture of the police and the prosecution service where a premium is put on swift convictions (Brodeur 2010; Carson 2007; Knutsson 2013; Maguire 1994). A growing body of research has begun to identify the ways in which such biases can pervade the entire chain of justice from eyewitnesses (Hasel and Kassin 2009; Loftus and Ketcham 1991) to detectives (Ask and Alison 2010; Gudjonsson 1995; Meissner and Kassin 2002), forensic experts (Dror 2011; Dror et al. 2005), jurors (Charman et al. 2009; Georges et al. 2013) and professional judges (Granhag et al. 2005; Hasel and Kassin 2009). In sum, this might lead to a chain reaction in the criminal justice process, denoted by Dror (2012) as the “bias snowball effect”. Likewise, Findley and Scott (2006) describe this as a form of systemic tunnel vision created through a “compendium of common heuristics and logical fallacies”, to which we are all susceptible. This leads actors in the criminal justice system to “focus on a suspect, select and filter the evidence that will ‘build a case’ for conviction, while ignoring or suppressing evidence that points away from guilt” (p. 292).

Such an orientation towards positive testing strategies favours a somewhat naïve inductivism, where pieces of evidence that point towards the guilt of a suspect are seen as more valuable than those that would “falsify” the same hypothesis (Popper 2002; Wason 1960). This is perhaps why much of the previous experimental research on detectives’ decision-making has used detective ratings of a suspect’s guilt as the dependent variable (e.g. Ask and Granhag 2005). This is not ideal for two reasons: (1) it is not in the professional role of a detective to decide on guilt, and (2) making such a rating may trigger a premature guilt assumption that was not necessarily there before. Ideally, (legal) decision-makers should be flexible, creative and adaptive throughout the entire process, since the reconstruction of preferences seems to be the natural outcome of the very process of decision-making (Janis and Mann 1977; Montgomery 1983).

Hence, detectives should have the necessary cognitive flexibility to adapt to the changes in a task or situation. Krems (1995) suggests three task-dependent mechanisms for flexible problem solving: the abilities to (1) consider alternative interpretations of the data, (2) modify their representation of the current situation and (3) change strategies to reflect the changes in the situation. Similarly, a systematic consideration of all the competing options has been found to facilitate more thorough and qualitatively better judgments (Hirt and Markman 1995). In a study of official reviews of how British Senior Investigative Officers (SIOs) seem to think and decide in murder investigations, Jones et al. (2008) describe how they first identify and record the relevant competing hypotheses that can be identified in the case. Then, they seek to disprove each one; the one that remains after such attempts at falsification is probably the strongest theory. Although we are not aware of any studies testing the effectiveness of such an approach among experienced detectives, O’Brien (2009) showed that mock investigators (college students) became less biased in their information search and inferences following the systematic consideration of each hypothesis.

In an attempt to investigate how differences in training and education may influence investigators’ decision-making skills, Fahsing and Ask (2016) found that English and Norwegian police officers differed significantly in their ability to generate relevant investigative hypotheses and investigative actions. Moreover, drawing on the work by Gollwitzer (1990) describing the effects of mind-set changes in different stages of goal-directed behaviour, Fahsing and Ask (2013) found that highly experienced detectives identified the decision to make an arrest as the most significant “tipping-point” in a criminal investigation. That is, after a decision to arrest a suspect has been made, according to detectives, it becomes more difficult to maintain an open mind and consider multiple hypotheses. Hence, a decisional tipping-point is any judgement or decision with a potential to trigger a shift in detectives’ mind-set, from deliberation to implementation. This issue was investigated by Fahsing and Ask (2016), who manipulated the presence of potential tipping-points in fictive cases presented to both novice and experienced investigators, but did not observe any effect of tipping-points on participants’ ability to generate alternative investigative hypotheses. The present study will utilise the same design in order to test the abductive ability of Norwegian police recruits in an investigative setting as well as the robustness of this ability when exposed to a real-life investigative tipping-point: the decision to arrest a suspect.

Investigation as Abductive Reasoning

The best place to begin explaining the concept of abductive reasoning is to describe what are usually taken to be the success criteria for different types of logical reasoning. Logical reasoning is the process of using existing knowledge to draw conclusions, make predictions or construct explanations (Staat 1993). Deductive reasoning starts with the assertion of a general rule and proceeds from there via premises to a guaranteed specific conclusion. It leads downwards; if the original assertion is true, then the conclusion must also be true. Inductive reasoning typically begins with observations that are specific and limited in scope, and leads inwards to a generalised conclusion that is highly probable, but not necessarily logically true, in light of the observed evidence. Inductive reasoning moves from the specific to the general. Abductive reasoning begins with an incomplete set of observations and leads outwards to find the most likely explanation for the set. Abduction therefore involves the creative process of inferring the most probable hypotheses that might explain a phenomenon or reveal some (possibly new) observation. A medical diagnosis is a typical application of abductive reasoning: given a set of symptoms, what is the diagnosis that would best explain most of them? Sometimes the symptoms are so ambiguous or hard to test that a clear diagnosis is not possible. Then the abductive process can be used to (1) generate the best guesses based on what is known and (2) evaluate the competing hypotheses to find the best diagnosis (Josephson and Josephson 1994). We can call both inferences ampliative, selective and creative, because in both cases the reasoning involved amplifies, or goes beyond, the information incorporated in the premises. Therefore, abduction entails both a logical and analytic dimension closely intertwined with a more creative and synthetic dimension (Patokorpi 2006).
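
As a toy illustration of the two abductive steps described above (generate candidate explanations, then evaluate them against the observations), the following Python sketch ranks hypotheses by the share of observations each one accounts for. The symptoms, candidate diagnoses and the simple coverage score are invented for illustration only; they are assumptions of this sketch and are not taken from the study or from Josephson and Josephson (1994).

```python
def rank_explanations(observations, candidates):
    """Rank candidate hypotheses by the share of observations each explains.

    `candidates` maps a hypothesis name to the observations it would explain.
    The coverage score is a deliberately naive stand-in for "best explanation".
    """
    obs = set(observations)
    if not obs:
        raise ValueError("at least one observation is required")
    scored = []
    for name, explains in candidates.items():
        coverage = len(obs & set(explains)) / len(obs)
        scored.append((name, coverage))
    # Highest coverage first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical example data (not from the source).
observations = ["fever", "cough", "fatigue"]
candidates = {
    "influenza": ["fever", "cough", "fatigue"],
    "common cold": ["cough"],
    "anaemia": ["fatigue"],
}
ranking = rank_explanations(observations, candidates)
```

A realistic abductive procedure would also weigh prior plausibility and seek new observations that discriminate between the top-ranked candidates; the sketch only shows the ranking step.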

Likewise, if there are reasonable grounds to suspect that someone is guilty of a crime, an investigation in the form of an abductive process should commence and (a) identify what hypotheses are best supported by the available information and (b) identify, secure, document and cross-check information from sources of information with potential to discriminate between the competing hypotheses (Innes 2003; Jackson 1988). Ideally, the investigative process should continue until (only) one explanation stands out as the best, and if this can be proven beyond reasonable doubt, the case should be prosecuted. Similarly, the burden of proof in criminal cases is typically met if the investigation can rule out all reasonable alternatives to guilt beyond reasonable doubt (Klamberg 2015). The expression beyond (reasonable doubt) indicates that the hypothesis of guilt must be proven to go beyond all other reasonable alternative hypotheses put forward in the case. As stated by Zuckerman and Roberts (2010); “the fact-finder have to follow a mental procedure of progressive elimination of explanation consistent with innocence” (p. 134). On the other hand, the information gathered in criminal investigations is often both ambiguous and fragile (Dror and Cole 2010). This is only one out of several reasons that make true elimination impossible in criminal investigations and trials. Therefore, the burden of proof cannot be understood as a truly deductive elimination of all possible alternatives to guilt (Jackson 1988). The available data simply does not always allow for this (Innes 2003); hence, criminal investigation can be understood as a pragmatic and abductive process aiming to identify the most reasonable set of competing explanations based on the available information (Brodeur 2010; Diesen 2000).

Cognitive Ability and Detective’s Job Performance

Cognitive or mental ability is one of the most widely researched psychological constructs (Weisberg and Reeves 2013). The U.S. Military was the first to apply large-scale ability testing, assessing almost two million individuals during World War I (Mandler 2007). Although there is some inconsistency among definitions of what cognitive ability tests actually measure, it can generally be defined as the conscious ability to learn (Hunter 1986). This sort of “intelligence testing” has become the dominant method for personnel selection in a wide range of public and private sector organizations (Domino and Domino 2006). Although its generalisability has been questioned (Neisser et al. 1996; Schlinger 2003; Stanovich and West 2014a), it seems that cognitive ability is one of the most reliable predictors of overall job performance (Aamodt 2004; Hirsh et al. 1986). Cognitive ability has been found to relate to performance by facilitating facts-acquisition, learning procedures, problem solving and job-specific rules (Motowidlo et al. 1997), all of which are thought to be highly important in law-enforcement selection (Ono et al. 2011). Typically, in performance evaluation studies, cognitive ability accounts for about 25% of the variance in performance measures (Motowidlo 2003; Ono et al. 2011).

A related question is to what degree cognitive ability promotes rational thinking in complex real-life decision-making. At least two theoretical positions strongly suggest that intelligence and complex problem solving are related. First, the ability to solve problems features prominently in almost every definition of human “intelligence”; thus, problem-solving capacity is viewed as one component of intelligence. Second, intelligence is often assumed to be a predictor of problem-solving ability (Wenke et al. 2005). A considerable amount of empirical data suggest that specific components of intelligence, such as processing capacity, are related to specific components of complex problem solving. For example, positive correlations have been established between scores on tests of cognitive ability and resistance to typical decision-making biases, such as overconfidence (Stanovich and West 2000), statistical reasoning errors (Nisbett et al. 1983), framing errors (Stanovich and West 1998) and hindsight bias (Stanovich and West 2000). Bruine de Bruin et al. (2007) found that behavioural decision-making tasks, such as estimation of health risks, were related to high scores on tests of fluid intelligence and reading comprehension.

On the other hand, these relationships are somewhat mitigated by the fact that many studies have found that a number of undesirable effects of heuristics and biases seem to operate independently of intelligence (Stanovich 2009; Stanovich and West 1997; Stanovich and West 2000, 2008; Stanovich et al. 2013). Studies done on tasks connected with “scientific” and probabilistic reasoning such as covariation detection, hypothesis testing, disjunctive reasoning and denominator neglect have not been found to support a claim for a strong relationship between intelligence tests and operational rationality (e.g. Bruine de Bruin et al. 2007; Frederick 2005; Stanovich and West 2000). Hence, Stanovich and West (2014b) conclude that a number of rational thinking tasks seem quite dissociated from IQ. The notion that IQ tests do not measure all key human cognitive faculties is not new; critics of intelligence tests have been making that point for years (Neisser et al. 1996; Sternberg 2002). The degree to which intelligence test scores are correlated with overall giftedness, practical intelligence, and rational thinking is therefore also debated (Baron 1985; Chomsky 1972; Gardner 2011; Stanovich and West 2014b).

Moreover, meta-analyses in law-enforcement settings with US samples indicate that cognitive ability has a significantly weaker correlation with work performance than in other occupational groups (Hirsh et al. 1986; Ono et al. 2011). A meta-analysis of European samples revealed similar findings (Salgado et al. 2003). One explanation for the seemingly weak relationship between cognitive ability tests and job performance is the social and interactive nature of most law-enforcement jobs. That is, task, situation, personality, maturity, motivation and interpersonal skills might have larger effects on officers’ job performance than cognitive ability (Hirsh et al. 1986). A study of North-American police officers by Smith and Aamodt (1997) found a significant relationship between job performance and age, experience, motivation and level of education. The study did not control for overall cognitive ability, however. In a survey of British police officers, O’Neill (2011) found no correlations between IQ scores on Raven’s Standard Progressive Matrices and objective and subjective measures of investigative success. However, the measures of investigative success in O’Neill’s study were based on quite ambiguous sources such as clearance rates and self-reports. Hence, there are very few published experimental studies on how cognitive ability relates to detectives’ judgments and decision-making. In a rare study including 50 Swedish detectives, Ask and Granhag (2005) found that need for cognitive closure (NFC) moderated confirmation bias when evaluating the strength of the evidence against a prime suspect in a fictitious homicide case. Investigators found to be high (vs. low) in NFC were somewhat more likely to identify exonerating information when it confirmed their hypothesis, but somewhat less likely when the information disconfirmed their hypothesis.

Although lateral thinking, decision-making, and creativity have been identified as crucial for individual job performance in criminal investigations (Fahsing and Gottschalk 2008; Irvine and Dunningham 1993; Smith and Flanagan 2000), none of these constructs has been defined with precision or consensus in the available literature (Alison et al. 2007; Ask and Alison 2010; Fahsing and Ask 2013). This may account for the weak relationship between cognitive ability tests and investigative performance. Typical measures of criminal investigation success, such as clearance rates, number of arrests and charges, are all quite poor indicators of detective performance and investigative quality (Brodeur 2010; Knutsson 2013; Maguire et al. 1991; O’Neill and Milne 2014). Taken together, the available evidence both from basic and applied research suggests that the global concepts of intelligence and problem solving are not always related, but that specific subcomponents of intelligence and explicit problem solving might share variance. Hence, it seems of vital importance to define some stable and testable measures of detective performance. A crucial question therefore remains: What type of cognitive abilities will promote investigative thinking and how can these abilities be tested in a reliable way?

The Present Research

The aim of the present research was to test to what degree measures of inductive and deductive reasoning skills, used for recruitment to the Norwegian police, can predict recruits’ ability to generate investigative hypotheses and actions, and whether such individual differences moderate recruits’ vulnerability to decisional tipping-points. These skills can be seen as essential for flexible problem solving as described above by Krems (1995) and as a natural starting-point for the application of abductive reasoning in criminal investigations. Participants in the study were presented with two crime scenarios based on real-life missing-person cases. To experimentally manipulate the presence of a decisional tipping-point (Fahsing and Ask 2013), we inserted additional information in one of the cases that a person close to the victim had been arrested. Participants were asked to generate as many relevant investigative hypotheses as possible for each case. The same scenarios and procedure were used by Fahsing and Ask (2016), who gauged the participants’ responses against a so-called gold standard, an exhaustive list of relevant hypotheses for the two cases, generated by an international expert panel of senior homicide detectives. Hence, we were able to assess both the quantity and the quality of participants’ hypothesis and action generation.

Four specific hypotheses were formulated for the study: First, deductive reasoning will be positively correlated with the number of generated hypotheses (H1). This prediction rests on the assumption that the identification of alternatives from a given set of facts rests on the so-called law of detachment (also known as affirming the antecedent and modus ponens), which is the first form of all syllogisms and deductive reasoning (Rips 1994). For example, a murdered person cannot at the same time be alive. Hence, if one cannot confirm that a person is not alive, one cannot conclude murder. Second, inductive reasoning ability will be positively correlated with hypothesis generation (H2), because a core aspect of inductive reasoning is to suggest which logical arguments and phenomena the given premises allow for (Evans 1989). For instance, if a woman has gone missing, one of the inductive probabilities is that she might have been subjected to a crime. Third, participants will generate fewer investigative hypotheses when an investigative tipping-point (i.e. decision to arrest) is present (vs. absent) in the case material (H3), as this is likely to trigger a shift from an open-minded deliberative mind-set to a rigid implemental mind-set (Gollwitzer et al. 1990). Fourth, participants scoring high on inductive (H4a) and deductive (H4b) reasoning will be less vulnerable to the influence of investigative tipping-points compared with participants with lower scores. This is predicted because high levels of general logical reasoning capacity should make participants more robust to irrelevant contextual influences (Hunter 1986; Kahneman and Frederick 2002).

Method

Participants

A total of 166 police recruits (106 males, 60 females) from the Norwegian Police University College, representing two different locations in the country, voluntarily participated in the study. Participants were first-year students at the time and had no prior policing experience or education as detectives. Their mean age was 23.1 years (SD = 3.2). Fifty-eight (34.9%) had previous higher education, and 39 (23.5%) had expressed a preference for future work as a detective.

Materials

Inductive and Deductive Reasoning

As part of the recruitment process for the Police University College, participants had completed a cognitive aptitude test administered online by the international recruitment and assessment company Cut-e. The test is certified by Det Norske Veritas (DNV), according to the framework of the International Test Commission. Cut-e claim that their test predicts applicants’ predisposition for general logic problem-solving skills by testing a combination of inductive and deductive reasoning (Cut-e 2016). The subtask measuring inductive reasoning (scales ix) consists of a series of visual pattern arrays where the respondent’s task is to identify the rule that underlies the generation of patterns, and indicate which of the patterns do not conform to the rule. The task is timed (5 min) and respondents are presented with up to 20 different arrays. According to the test developers, the split-half reliability of the task is r = .86. The subtask measuring deductive reasoning (scales sx) consists of a series of visual shape–numerical operator combinations (learning items), and the respondent’s task is to identify which of several alternative numerical operators is necessary to transform a specific set of visual shapes into a specified target state (test item). Again, the task is timed (5 min) and respondents are presented with up to seven different test items. For both inductive and deductive reasoning, accuracy scores were used as predictor variables in the analyses, representing the number of correct solutions controlling for the number of tasks attempted. All participants had given their consent for the researchers to access the results from the Cut-e tests.
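
The text does not spell out how the number of correct solutions was "controlled for" the number of tasks attempted. One simple reading, shown here purely as an assumption, is the proportion of attempted items answered correctly. The function name and the handling of zero attempts are hypothetical choices of this sketch, not Cut-e's scoring procedure.

```python
def accuracy_score(correct, attempted):
    """Proportion of attempted items solved correctly (assumed scoring rule)."""
    if correct < 0 or attempted < 0 or correct > attempted:
        raise ValueError("need 0 <= correct <= attempted")
    if attempted == 0:
        # No items attempted: return 0.0 rather than dividing by zero.
        return 0.0
    return correct / attempted


# Hypothetical respondent: 12 of 16 attempted inductive arrays correct.
example = accuracy_score(12, 16)
```

Under this reading, a respondent who attempts few items but solves them all would score higher than one who attempts many and makes several errors, which is why the choice of scoring rule matters for the predictor variables.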

Case Materials

Two vignettes (cases A and B), each the length of an A4 page, were used as stimulus materials. The vignettes were fictional, but were inspired by several actual missing-person cases in Norway and England. The same vignettes were used by Fahsing and Ask (2016), but the results from that study had not been publicised in any way prior to this study. The circumstances surrounding the cases were thoroughly masked or manipulated to prevent recognition of the cases and to allow for the insertion of investigative tipping-points. Extensive pretesting and modification of the material had been undertaken to make sure the vignettes allowed participants to generate several alternative hypotheses, while at the same time not requiring any highly specialised diagnostic knowledge. For instance, the missing persons could have been the victim of murder, kidnapping, an accident or sudden illness, or could have committed suicide or run away, and were portrayed as females of mixed cultural background to invite speculation about crimes stereotypically associated with different cultures (Ask and Alison 2010; Innes 2003; Macpherson 1999).

Each vignette consisted of (1) a brief introduction to the background of the victim, their relationship with key persons in the case and potential motives for violence against the victim and (2) a longer section with preliminary findings in the investigation, including information obtained from witnesses, initial crime scene analyses and subsequent investigative actions. To manipulate the presence of a decisional tipping-point in one of the two cases, participants were informed that a key person in the case (the victim’s father in case A; the victim’s husband in case B) had been arrested. The order of the two cases and the location of the tipping-point (case A or case B) were randomised across participants.
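
The randomisation described above can be sketched as follows. This is a minimal illustration assuming simple independent randomisation per participant; the source does not say whether assignment was simple or counterbalanced, and the function name, field names and seed are hypothetical.

```python
import random


def assign_conditions(participant_ids, seed=None):
    """Randomly assign case order and tipping-point location per participant."""
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    assignments = {}
    for pid in participant_ids:
        order = rng.choice([("A", "B"), ("B", "A")])   # which case is read first
        tipping_case = rng.choice(["A", "B"])          # which case contains the arrest
        assignments[pid] = {"order": order, "tipping_point_case": tipping_case}
    return assignments


# 166 participants, as in the study; the seed value is arbitrary.
allocation = assign_conditions(range(1, 167), seed=2016)
```

Simple randomisation does not guarantee equal cell sizes; a blocked or counterbalanced scheme would be needed if balanced groups were required.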

Procedure

The experimental sessions were held in a quiet room at the participants’ learning institution. Participants were told to turn off their mobile phones, not to talk to each other and not to use any external aids while working with the materials. After being seated, participants received a booklet containing information about the research, an informed consent form, a demographics questionnaire, task instructions and the two crime vignettes. Participants were instructed to imagine that the events in the vignettes had taken place on the previous day, that they had just been put in charge of the investigation and that they had all the necessary operative resources at their disposal. They were given 30 min for each case to generate as many relevant hypotheses and investigative actions as possible. These were to be written down on two blank sheets, one for hypotheses and one for actions. To prevent fatigue, participants were allowed a 5-min break after the completion of the first vignette. After completing the entire booklet, participants were debriefed about the purpose of the research and thanked for their participation.

Data Preparation

To assess the quality of participants’ responses, they were gauged against a gold standard of relevant investigative hypotheses, which had been established by Fahsing and Ask (2016). An expert panel, recruited by Fahsing and Ask, comprised 30 peer-recognised homicide-investigation experts in Norway and England. Through a collaborative, iterative process, the panel members completed an exhaustive list (for each of the two cases) of all the hypotheses that an officer in charge of the investigation should evaluate. For more details on the creation of the gold standard, see Fahsing and Ask (2016).

The gold standards for the two cases contained approximately the same number of hypotheses (9 vs. 11). All of the gold-standard hypotheses were competing and mutually exclusive. The missing woman could either have (1) run away, (2) been struck by accident or sudden illness, (3) committed suicide, (4) been killed by her family, (5) been killed by someone else she knew, (6) been killed by a stranger, (7) been kidnapped by her family, (8) been kidnapped by someone else she knew or (9) been kidnapped by a stranger. Because the missing woman in case B was married, another two hypotheses were added, namely that she could have been (10) killed by her husband or (11) kidnapped by her husband. A trained coder analysed participants’ responses and counted the number of the abovementioned gold-standard hypotheses. To make the measures comparable across cases, the proportion of reported gold-standard hypotheses out of the possible maximum (9 for case A; 11 for case B) was calculated for each case and used as dependent variables in subsequent analyses. None of the respondents reported any hypotheses over and above those included in the gold standard. In the original study by Fahsing and Ask (2016), the intraclass correlation coefficient (ICC) between two independent coders was .975 for the number of generated gold-standard hypotheses.
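
The dependent variable described above can be computed as in the following sketch. The maxima (9 for case A, 11 for case B) and the numbering of hypotheses follow the text; the function name and the representation of a coded response as a list of hypothesis numbers are assumptions made for illustration.

```python
# Number of gold-standard hypotheses per case, as stated in the text.
GOLD_STANDARD_SIZE = {"A": 9, "B": 11}


def gold_standard_proportion(case, coded_hypotheses):
    """Share of the case's gold-standard hypotheses that a participant reported.

    `coded_hypotheses` holds the gold-standard numbers (1-9 or 1-11) that the
    coder identified in the response; repeats are counted only once.
    """
    maximum = GOLD_STANDARD_SIZE[case]
    distinct = {h for h in coded_hypotheses if 1 <= h <= maximum}
    return len(distinct) / maximum


# Hypothetical participant: ran away (1), killed by family (4), killed by stranger (6).
case_a_score = gold_standard_proportion("A", [1, 4, 6])
# Hypothetical participant in case B who mentioned hypothesis 10 twice.
case_b_score = gold_standard_proportion("B", [1, 10, 11, 10])
```

Dividing by the case-specific maximum is what makes scores comparable across cases: three hypotheses yield a higher proportion in case A than in case B.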

Results

Preliminary Analyses

Preliminary analyses were run to investigate whether participants’ gender, age, previous higher education or preference for future detective work was related to the proportion of generated gold-standard hypotheses. None of these relationships was statistically significant for either of the two cases, all ps > .403. Hence, these background variables were omitted from the analyses that follow.

Main Analyses

A 2 (case: A vs. B) × 2 (tipping-point location: case A vs. case B) mixed analysis of variance (ANOVA), with case as the within-participants factor, was performed on the percentage of generated gold-standard hypotheses. The analysis revealed a large effect of case, F(1, 164) = 197.19, p < .001, ηp² = .546, indicating that participants generated a substantially higher percentage of gold-standard hypotheses in case A (M = 49.2, SD = 20.3, 95% CI [46.1, 52.3]) compared with case B (M = 29.5, SD = 10.8, 95% CI [27.9, 31.2]). There was no main effect of tipping-point location, F(1, 164) = 0.56, p = .454, ηp² = .003. However, the case × tipping-point location interaction was statistically significant, F(1, 164) = 7.47, p = .007, ηp² = .044, showing that the percentage of generated hypotheses in each case depended on whether the tipping-point was located in that case or not (see Figure 1). The percentage of generated gold-standard hypotheses in case A was lower when the tipping-point was present in case A (M = 46.6, SD = 19.5, 95% CI [42.5, 50.8]) than when it was not (M = 52.0, SD = 20.9, 95% CI [47.3, 56.7]), although not significantly so, F(1, 164) = 2.99, p = .086, ηp² = .018. Similarly, the percentage of generated gold-standard hypotheses in case B was lower when the tipping-point was present in case B (M = 28.3, SD = 9.4, 95% CI [26.2, 30.4]) than when it was not (M = 30.6, SD = 11.9, 95% CI [28.1, 33.1]), but again not significantly so, F(1, 164) = 1.90, p = .170, ηp² = .011.
 
Fig. 1

Percentage of gold-standard hypotheses generated as a function of case and tipping-point (TP) presence
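
As a reading aid for the effect sizes in the ANOVA, the following sketch recomputes partial eta squared from each reported F value and its degrees of freedom, using the standard identity that partial eta squared equals (F × df effect) / (F × df effect + df error). The function name and labels are illustrative and not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported in the text; every test has df = (1, 164).
reported_f = {
    "case (main effect)": 197.19,
    "tipping-point location (main effect)": 0.56,
    "case x tipping-point location": 7.47,
    "tipping-point within case A": 2.99,
    "tipping-point within case B": 1.90,
}

for label, f_value in reported_f.items():
    print(label, round(partial_eta_squared(f_value, 1, 164), 3))
```

Rounded to three decimals, the recomputed values agree with the effect sizes reported for the five tests.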

Given the major difference in the percentage of hypotheses generated between the two cases, the analyses examining the contribution of individual differences were performed separately for case A and case B, respectively. For each case, a hierarchical multiple regression analysis was conducted using the percentage of generated gold-standard hypotheses as the outcome variable. In step 1, participants’ inductive and deductive reasoning scores were entered as predictor variables. In step 2, a dummy-coded variable representing the tipping-point manipulation (1 = present, 0 = absent) was added. In step 3, two-way interaction terms involving all pairs of the predictor variables were entered. Finally, in step 4, a three-way interaction term involving all three predictors was entered. The inductive and deductive reasoning scores were mean centred before interaction terms were created to ease the interpretation of coefficients and to minimise collinearity in the model.
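
The construction of the predictor set described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, the scores are made up, and the model fitting itself is left out. The sketch only shows how the reasoning scores are mean centred before the interaction terms are formed and which terms enter at each step.

```python
from statistics import mean


def centre(values):
    """Subtract the sample mean, so that zero corresponds to an average score."""
    m = mean(values)
    return [v - m for v in values]


def build_predictors(inductive, deductive, tipping_point):
    """Return all predictor columns plus the column names entered at each step."""
    ind = centre(inductive)
    ded = centre(deductive)
    tp = list(tipping_point)  # dummy code: 1 = present, 0 = absent
    columns = {
        "IND": ind,
        "DED": ded,
        "TP": tp,
        # Interaction terms are products of the centred scores and the dummy.
        "IND x DED": [i * d for i, d in zip(ind, ded)],
        "IND x TP": [i * t for i, t in zip(ind, tp)],
        "DED x TP": [d * t for d, t in zip(ded, tp)],
        "IND x DED x TP": [i * d * t for i, d, t in zip(ind, ded, tp)],
    }
    two_way = ["IND x DED", "IND x TP", "DED x TP"]
    steps = {
        1: ["IND", "DED"],
        2: ["IND", "DED", "TP"],
        3: ["IND", "DED", "TP"] + two_way,
        4: ["IND", "DED", "TP"] + two_way + ["IND x DED x TP"],
    }
    return columns, steps


# Made-up scores for six fictitious participants, for illustration only.
inductive = [12, 15, 9, 18, 11, 13]
deductive = [20, 17, 22, 16, 19, 14]
tipping_point = [1, 0, 1, 0, 1, 0]

columns, steps = build_predictors(inductive, deductive, tipping_point)
for step, names in steps.items():
    print(step, names)
```

Because the dummy variable is coded 0 for the tipping-point-absent condition, centring the reasoning scores means that the tipping-point coefficient in steps 3 and 4 describes the effect at an average level of inductive and deductive reasoning.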

The results of the analysis for case A are presented in Table 1. As can be seen in step 1, participants’ inductive and deductive reasoning ability did not explain any of the variance in the generation of gold-standard hypotheses, R² = .001. The addition of the tipping-point manipulation in step 2 resulted in a small increase in explained variance, ΔR² = .018, but the predictive value of tipping-point presence fell short of statistical significance (p = .086). The addition of two-way (ΔR² = .006) and three-way (ΔR² = .003) interactions in steps 3 and 4, respectively, did not add any meaningful portions of explained variance. Hence, while there was a tendency for the presence of the tipping-point in case A to reduce the number of generated gold-standard hypotheses (mirroring the ANOVA reported above), there was no indication that individual differences in inductive and deductive reasoning moderated this influence.

Table 1

Hierarchical multiple regression analysis predicting the percentage of generated gold-standard hypotheses from inductive reasoning ability, deductive reasoning ability and tipping-point presence (case A)

Predictor                     b        95% CI            SE b    β        p
Step 1 (R² = .001)
  Inductive reasoning         0.04     [−0.27, 0.35]     0.16    0.02     .784
  Deductive reasoning         0.03     [−0.26, 0.32]     0.15    0.02     .845
Step 2 (ΔR² = .018)
  Inductive reasoning         0.04     [−0.27, 0.35]     0.16    0.02     .802
  Deductive reasoning         0.04     [−0.25, 0.33]     0.15    0.02     .777
  Tipping-point (a)           −5.46    [−11.69, 0.77]    3.16    −0.13    .086
Step 3 (ΔR² = .006)
  Inductive reasoning (IND)   0.08     [−0.36, 0.52]     0.22    0.04     .712
  Deductive reasoning (DED)   0.17     [−0.34, 0.69]     0.26    0.09     .510
  Tipping-point (TP)          −5.60    [−11.88, 0.68]    3.18    −0.14    .080
  IND × DED                   −0.01    [−0.04, 0.02]     0.01    −0.06    .484
  IND × TP                    −0.12    [−0.75, 0.51]     0.32    −0.04    .711
  DED × TP                    −0.14    [−0.76, 0.47]     0.31    −0.06    .646
Step 4 (ΔR² = .003)
  Inductive reasoning (IND)   0.12     [−0.33, 0.58]     0.23    0.06     .599
  Deductive reasoning (DED)   0.12     [−0.43, 0.66]     0.27    0.06     .673
  Tipping-point (TP)          −5.44    [−11.76, 0.87]    3.20    −0.13    .090
  IND × DED                   0.00     [−0.04, 0.05]     0.02    0.01     .938
  IND × TP                    −0.15    [−0.78, 0.49]     0.32    −0.05    .651
  DED × TP                    −0.06    [−0.72, 0.60]     0.33    −0.02    .866
  IND × DED × TP              −0.02    [−0.08, 0.04]     0.03    −0.09    .478

N = 166. Final model R² = .028, F(7, 158) = 0.65, p = .716

(a) 1 = tipping-point present, 0 = tipping-point absent

The results of the analysis for case B are presented in Table 2. Again, individual differences in inductive and deductive reasoning were unrelated to the generation of gold-standard hypotheses, R² = .002. The addition of the tipping-point manipulation in step 2 caused a very small increase in explained variance, ΔR² = .011, and in contrast to case A, the regression coefficient for tipping-point did not approach statistical significance (p = .172). The introduction of two-way interactions at step 3 increased explained variance by about 2% (ΔR² = .019), but none of the individual interaction terms were statistically significant (all ps ≥ .160). Finally, the three-way interaction term did not contribute to the regression model in step 4, ΔR² = .003.

Table 2

Hierarchical multiple regression analysis predicting the percentage of generated gold-standard hypotheses from inductive reasoning ability, deductive reasoning ability and tipping-point presence (case B)

Predictor                     b        95% CI            SE b    β        p
Step 1 (R² = .002)
  Inductive reasoning         0.04     [−0.13, 0.20]     0.08    0.04     .645
  Deductive reasoning         0.01     [−0.14, 0.17]     0.08    0.01     .868
Step 2 (ΔR² = .011)
  Inductive reasoning         0.04     [−0.12, 0.20]     0.08    0.04     .630
  Deductive reasoning         0.01     [−0.15, 0.16]     0.08    0.01     .922
  Tipping-point (a)           −2.31    [−5.64, 1.02]     1.69    −0.11    .172
Step 3 (ΔR² = .019)
  Inductive reasoning (IND)   −0.04    [−0.28, 0.19]     0.12    −0.04    .721
  Deductive reasoning (DED)   −0.06    [−0.26, 0.13]     0.10    −0.06    .529
  Tipping-point (TP)          −2.23    [−5.56, 1.10]     1.69    −0.10    .188
  IND × DED                   0.00     [−0.02, 0.01]     0.01    −0.03    .715
  IND × TP                    0.14     [−0.19, 0.48]     0.17    0.10     .399
  DED × TP                    0.23     [−0.09, 0.56]     0.16    0.14     .160
Step 4 (ΔR² = .003)
  Inductive reasoning (IND)   −0.05    [−0.29, 0.19]     0.12    −0.05    .689
  Deductive reasoning (DED)   −0.08    [−0.28, 0.12]     0.10    −0.08    .451
  Tipping-point (TP)          −2.16    [−5.50, 1.19]     1.69    −0.10    .205
  IND × DED                   0.00     [−0.02, 0.02]     0.01    0.02     .884
  IND × TP                    0.13     [−0.21, 0.47]     0.17    0.09     .448
  DED × TP                    0.27     [−0.08, 0.63]     0.18    0.16     .124
  IND × DED × TP              −0.01    [−0.04, 0.02]     0.02    −0.07    .520

N = 166. Final model R² = .035, F(7, 158) = 0.81, p = .581

(a) 1 = tipping-point present, 0 = tipping-point absent

Additional Analyses

We also examined how the frequencies of generated hypotheses were distributed within each case. In case A involving a missing Kurdish girl, there were only ambiguous circumstances pointing towards the girl’s father (see Fig. 2). Nevertheless, participants were clearly biased towards hypotheses involving the abduction or killing of the girl by family members, and were markedly less likely to consider non-criminal hypotheses (e.g. accident/illness, suicide). As mentioned above, case B was structurally similar to case A, but with a married Chinese woman gone missing (see Fig. 3). The vignette contained information about the discovery of a dead body of an Asian woman in the vicinity of the missing person’s home. Despite the fact that the identity of the body had not yet been established, participants displayed an even stronger bias than in case A towards hypotheses involving homicide, at the expense of alternative explanations. To quantify the amount of “crime bias” in participants’ responses, the proportion of criminal hypotheses out of the possible maximum (14), as well as the proportion of non-criminal hypotheses out of the possible maximum (6), was computed for each participant collapsed across the two cases. A dependent t test revealed that participants generated a much higher proportion of the criminal hypotheses (M = 46.7%, SD = 13.8, 95% CI [44.6, 48.8]) compared with the non-criminal hypotheses (M = 19.0%, SD = 19.1, 95% CI [16.0, 21.9]), t(165) = 18.37, p < .001, Hedges’ g_av = 1.68.
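
The size of the “crime bias” effect can be approximated from the summary statistics alone. The sketch below assumes the common definition of Hedges’ g_av for dependent samples, namely the mean difference divided by the average of the two standard deviations, multiplied by a small-sample correction. The paper does not state which formula was used, so both the formula and the function name are assumptions.

```python
def hedges_g_av(mean_1, sd_1, mean_2, sd_2, n_pairs):
    """Standardised mean difference for paired data, using the average SD.

    Assumed formula: d_av = (M1 - M2) / ((SD1 + SD2) / 2), followed by the
    usual small-sample correction with df = n_pairs - 1.
    """
    d_av = (mean_1 - mean_2) / ((sd_1 + sd_2) / 2)
    correction = 1 - 3 / (4 * (n_pairs - 1) - 1)
    return d_av * correction


# Means and SDs of the criminal vs. non-criminal proportions, as reported.
g = hedges_g_av(46.7, 13.8, 19.0, 19.1, 166)
print(round(g, 2))
```

Under these assumptions the result rounds to the reported value of 1.68. With N = 166 the correction factor is close to 1, so the corrected and uncorrected estimates are nearly identical.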
 
Fig. 2

Proportion of participants who reported each of the gold-standard hypotheses in case A (involving a missing Kurdish girl)

 
Fig. 3

Proportion of participants who reported each of the gold-standard hypotheses in case B (involving a missing Chinese woman)

Discussion

Our findings suggest that the tests of inductive and deductive reasoning ability used for selection and recruitment to the Norwegian Police do not successfully predict police students’ ability to generate hypotheses in criminal investigative scenarios. The lack of a relationship between indicators of an abductive problem-solving style and the cognitive ability test may be due to the fact that these forms of reasoning are characterised by distinct qualitative differences. Deductive reasoning applies rules in order to work out what happens in specific cases and why it happens. Broadly speaking, it is therefore a confirmatory exercise (Staat 1993). Inductive and abductive reasoning are more closely related since they both apply knowledge and rules to seek new explanations. Inductive reasoning, however, is not specifically about seeking new information, but about making predictions based on known information. Abductive reasoning does not have this restriction; it allows for any hypothesis, even pure guesswork, as long as it is consistent with the data and has the potential to add more explanatory power than the competing hypotheses (Lipton 2007). This makes abductive logic inherently different from induction and deduction because it is exploratory, and not confirmatory, in nature. Hence, the lack of a correlation between abductive thinking and a typical intelligence test-styled task is perhaps not surprising. This null finding adds to the debate on the feasibility and utility of traditional cognitive aptitude testing in the context of law-enforcement recruitment (Aamodt 2004). It also raises the question of which aspects of cognitive aptitude, if any, are most predictive of detective performance in complex crimes or other complex problem-solving tasks.

It is noteworthy that participants were strongly biased towards hypotheses implying criminal explanations (e.g. murder, kidnapping) in both of the investigative scenarios. In fact, over a third of the participants did not generate any non-criminal hypotheses in either of the cases. This ties in well with previous research in the field. In a series of studies, Ask et al. (2008) and Marksteiner et al. (2011) found that police trainees and law students displayed asymmetrical scepticism and substantial elasticity in their interpretation of the available evidence depending on the perceived strength of the evidence and how well it fitted with an initial hypothesis of guilt. Similarly, Eerland and Rassin (2012) found that the presence of information regarding potentially serious crimes suppressed explanations which should have been considered given the absence of certain cues and information. For instance, the absent information in the two vignettes implicitly told participants that the missing persons had not been found, identified or declared dead. This should prompt hypotheses assuming kidnapping and all the non-crime alternatives ranging from voluntary runaway to suicide. Ideally, then, present and absent information should be considered equally diagnostic. In a real-life investigation, such a complete lack of non-criminal hypotheses would clearly hamper the fact-finding process, seriously increase the chance of tunnel vision and guilt bias (see e.g. Kassin et al. 2003; Marksteiner et al. 2011), and reduce the chance of finding a missing person alive. Statistically, runaways, accidents and suicides occur much more often than do murders in Norway (e.g. fewer than one person is murdered for every 25 persons reported missing; KRIPOS 2014).

Unfortunately, our findings correspond with a number of previous studies and observations of crime and guilt biases among law-enforcement personnel (Brodeur 2010; Innes 2003; Meissner and Kassin 2002; Packer 1968). It is likely that some participants understood the task as one focused on identifying a main suspect. This perception may have been activated by an understanding among the students that detecting a crime (vs. detecting a non-crime) is more in line with traditional police culture and typical efficiency goals promoted in the criminal justice process. However, speaking against this possibility is the fact that the participants were explicitly instructed to identify all competing explanations. Interestingly, the tendency to focus on criminal explanations was also found in the previous study by Fahsing and Ask (2016), which used the same materials but tested police officers with more experience. This indicates that abductive thinking (reasoning to the best explanation) is difficult even under low-stress conditions and when open-minded thinking is explicitly encouraged.

The results did not consistently support our prediction that the decision to arrest a suspect would act as an investigative tipping-point. When looking at the two cases in conjunction, there was a significant pattern such that the number of hypotheses generated was a function of the location of the tipping-point across the two cases. When examining each case individually, however, participants did not generate significantly fewer relevant investigative hypotheses in the case when an arrest had (vs. had not) been made. This null finding may accurately reflect that strategic decisions, such as making an arrest, exert little influence on investigators’ hypothesis generation and testing. On the other hand, the number of generated hypotheses was lower in both cases when the tipping-point was present. The reason why this trend was more pronounced in case A than in case B is not entirely clear, but the fact that the average number of generated hypotheses differed markedly between the two cases might provide a clue. On average, participants produced more than 50% of the gold-standard hypotheses in case A with no tipping-point present. This allows for a substantial impact of a decisional tipping-point. In case B, in contrast, the average hypothesis-generation rate was just above 30%. This low baseline may not have allowed for much further reduction when the tipping-point was introduced (i.e. a floor effect). Practically speaking, the fact that case B gave information about the discovery of a female body probably prompted many participants to commit to a hypothesis involving homicide. The information given in the tipping-point (arrest of a potential killer) was consistent with a hypothesis of murder and therefore did not alter the hypotheses considered. Importantly, however, as long as the body had not been identified as the missing woman, all non-crime hypotheses should be kept open and active in the investigation.

The Norwegian police recruits in the current study performed quite well on the diagnostic reasoning task. Compared with the previous study by Fahsing and Ask (2016), using the same cases, they performed better than English police officers with more than 2 years of police experience and almost as well as highly experienced homicide detectives from Norway. Moreover, about 7% of the participants in the current study generated more than 65% of the gold-standard hypotheses, which is on par with the performance of highly specialised Senior Investigative Officers in the previous study by Fahsing and Ask. This exceptionally high performance by beginners, lacking both formal training and experience, indicates that our pursuit of early predictors of core investigative abilities is worthwhile.

A couple of limitations of the current research should be noted. The first one concerns the representativeness of our experimental task for criminal detectives’ actual work. Admittedly, the generation and testing of investigative hypotheses represents only one of many skills required to become an expert detective (Fahsing and Gottschalk 2008; O’Neill and Milne 2014; Smith and Flanagan 2000). Our results are not informative about other crucial skills, such as communication, cooperation and coordination. There appears to be strong consensus among policing experts, however, that the ability to generate and evaluate alternative explanations is one of the core defining features of an expert detective (ACPO 2012; Cook and Tattersall 2008; Smith and Flanagan 2000; Stelfox and Pease 2005). The relevant literature suggests that an adequate generation of relevant hypotheses is beneficial to the outcome of actual criminal investigations and reduces the risk of bias (Alison et al. 2013; Ask 2006; Macquet 2009; Simon 2012). Hence, despite its necessarily limited scope, the current study addresses a skill that is considered to contribute strongly to the success of a criminal investigation and that should therefore perhaps be tested more systematically in the recruitment of future detectives.

A second limitation concerns the measures of inductive and deductive reasoning that were used as predictor variables in the current study. One might question whether the type of reasoning assessed by those measures (sense-making based on visual patterns and numerical operators) displays sufficient similarities to the more contextualised reasoning assessed by our investigative task for any meaningful relationships to appear. Indeed, inductive and deductive reasoning abilities may have been more predictive had they been measured using contextualised, verbal reasoning tasks. On the other hand, the current tasks followed the same logic and format as many established measures of general cognitive ability (e.g. Raven’s progressive matrices; Domino and Domino 2006) which have been shown to correlate with a range of real-world abilities (Hunter 1986). It is also possible that the tests used in this study do not adequately reflect the intended underlying constructs (inductive and deductive logic). To our knowledge, there are no published validation studies available in the scientific literature, and the test developers refer to their own validation studies as a warrant of the test’s reliability and validity (Cut-e 2016). Hence, before reliable conclusions can be drawn regarding the relationships between the variables of concern, the current findings may need to be replicated in studies using more extensively validated measures of inductive and deductive reasoning.

In conclusion, the current research provides an initial attempt to identify early predictors of police officers’ aptitude for diagnostic reasoning in investigative scenarios. While none of the hypothesised predictors received empirical support, the large variability in participants’ performance, in conjunction with the exceptionally high performance of some individuals, suggests that the quest for reliable predictors is a fruitful avenue for future research. The influence of investigative tipping-points on detectives’ diagnostic reasoning, while somewhat ambiguous and currently poorly understood, should be followed up by future research using a wider variety of methods and even more realistic stimulus materials. Ideally, this should be tested in a full-scale simulator based on data and experiences from real-life criminal cases. A final take-home message concerns the clear preference for criminal (vs. non-criminal) hypotheses reflected in police students’ responses. This indicates that the “crime bias”, previously documented among both novice police officers and highly specialised detectives, is deeply rooted and present even at the very start of police officers’ careers.

Footnotes

1. Beyond reasonable doubt is the highest standard of burden of proof in Anglo-American jurisprudence and typically only applies in criminal proceedings. It has been described, in negative terms, as a proof having been met if there is no plausible reason to believe otherwise. However, it does not mean an absolute certainty. The standard that must be met by the prosecution’s evidence in a criminal prosecution is that no other logical explanation can be derived from the facts except that the defendant committed the crime, thereby overcoming the presumption that a person is innocent unless and until proven guilty (Jackson 1988).

2. In keeping with the procedure of Fahsing and Ask (2016), participants were asked to list both hypotheses and investigative actions. Because participants in the current study lacked investigative experience and had not yet received training in criminal investigation, however, their generation of investigative actions was deemed to be of little relevance. Hence, only participants’ generated hypotheses will be reported in this paper. The data for investigative actions can be requested from the corresponding author.

